Finding evolutionarily conserved cis-regulatory modules with a universal set of motifs – Supplementary Materials
Abstract
where P(w|M) is the probability of observing w given the motif model (drawn from the frequency matrix) and P(w|B) is the probability of observing w given the background model (estimated from the sequence). All subwords w satisfying L_M(w) > t_M are classified as M-occurrences. There are two standard approaches to choosing the threshold t_M [1]. The first aims to restrict the number of false positive motif occurrences: for an assumed type I error level α_1, t_M is chosen to satisfy P(L_M(w) > t_M | B) = α_1. Its disadvantage is poor control over the classification of true M-occurrences. The second approach restricts the number of false negatives by choosing t_M to satisfy P(L_M(w) < t_M | M) = α_2 for an assumed type II error level α_2. Unfortunately, it gives up control over the number of false positives, and consequently produces a significant disparity in the number of predicted instances of strong and weak motifs (i.e. motifs that are easy and hard to discriminate from the background, respectively). As explained in the main text, our method of CRM identification takes into account both positive and negative signals from the promoter sequence. The control of the two error types in motif prediction therefore has to be balanced, in the sense that the number of false positives should be of the same order as the number of false negatives. Following the approach proposed by [1] we set the threshold t_M satisfying the equation
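The two standard threshold choices described above can be illustrated with a small sketch. It rests on several assumptions that are not taken from the text: the score L_M(w) is assumed to be the log-likelihood ratio log(P(w|M) / P(w|B)), since its definition falls outside this excerpt; both models are assumed to treat positions independently; and the frequency matrix, the uniform background and all function names are hypothetical. Because the score distribution is discrete, the error levels are met as upper bounds rather than as exact equalities. The balanced threshold of [1] is not shown, because its defining equation is not part of this excerpt.

```python
import itertools
import math

ALPHABET = "ACGT"

def score_table(pfm, bg):
    """Enumerate every word of the motif length and return a list of
    (score, P(w|M), P(w|B)) tuples. The score is ASSUMED to be the
    log-likelihood ratio; all frequencies must be strictly positive."""
    rows = []
    for letters in itertools.product(ALPHABET, repeat=len(pfm)):
        p_motif = 1.0
        p_background = 1.0
        for pos, base in enumerate(letters):
            p_motif *= pfm[pos][base]      # position-specific frequency
            p_background *= bg[base]       # position-independent background
        rows.append((math.log(p_motif / p_background), p_motif, p_background))
    return rows

def type1_error(rows, t):
    """P(L_M(w) > t | B): background mass of words called occurrences."""
    return sum(pb for s, _, pb in rows if s > t)

def type2_error(rows, t):
    """P(L_M(w) < t | M): motif mass of words scoring below the threshold."""
    return sum(pm for s, pm, _ in rows if s < t)

def threshold_for_type1(rows, alpha1):
    """Smallest observed score t with type I error at most alpha1."""
    for t in sorted({s for s, _, _ in rows}):
        if type1_error(rows, t) <= alpha1:
            return t

def threshold_for_type2(rows, alpha2):
    """Largest observed score t with type II error at most alpha2."""
    for t in sorted({s for s, _, _ in rows}, reverse=True):
        if type2_error(rows, t) <= alpha2:
            return t

# Hypothetical 3-position motif and uniform background (toy values only).
pfm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
bg = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

rows = score_table(pfm, bg)
t1 = threshold_for_type1(rows, alpha1=0.05)
t2 = threshold_for_type2(rows, alpha2=0.05)
print("type I threshold:", t1, "-> type II error:", type2_error(rows, t1))
print("type II threshold:", t2, "-> type I error:", type1_error(rows, t2))
```

Printing the opposite error rate at each threshold shows the trade-off the text describes: fixing one error level leaves the other uncontrolled. Exhaustive enumeration is only practical for short motifs; longer ones would need a dynamic-programming score distribution.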
Similar resources
TRES: comparative promoter sequence analysis
Comparative promoter analysis is a promising strategy for elucidation of common regulatory modules conserved in evolutionarily related sequences or in genes showing common expression profiles. To facilitate such analysis, we have developed a software tool that detects conserved transcription factor binding sites, cis-elements, palindromes and k-tuples simultaneously in a set of sequences, and t...
CREME: a framework for identifying cis-regulatory modules in human-mouse conserved segments
MOTIVATION The binding of transcription factors to specific regulatory sequence elements is a primary mechanism for controlling gene transcription. Recent findings suggest a modular organization of binding sites for transcription factors that cooperate in the regulation of genes. In this work we establish a framework for finding recurrent cis-regulatory modules in the promoters of a selected se...
Unraveling transcriptional control in Arabidopsis using cis-regulatory elements and coexpression networks.
Analysis of gene expression data generated by high-throughput microarray transcript profiling experiments has demonstrated that genes with an overall similar expression pattern are often enriched for similar functions. This guilt-by-association principle can be applied to define modular gene programs, identify cis-regulatory elements, or predict gene functions for unknown genes based on their c...
Prediction of similarly-acting cis-regulatory modules by subsequence profiling and comparative genomics in D. melanogaster and D. pseudoobscura
Motivation: To date, computational searches for cis-regulatory modules (CRMs) have relied on two methods. The first, phylogenetic footprinting, has been used to find CRMs in non-coding sequence, but does not directly link DNA sequence with spatio-temporal patterns of expression. The second, based on searches for combinations of transcription factor (TF) binding motifs, has been employed in geno...
Publication date: 2008